Voice quality transformation using an extended source-filter speech model
نویسندگان
چکیده
In this paper we present a flexible framework for parametric speech analysis and synthesis with high quality. It constitutes an extended source-filter model. The novelty of the proposed speech processing system lies in its extended means to use a Deterministic plus Stochastic Model (DSM) for the estimation of the unvoiced stochastic component from a speech recording. Further contributions are the efficient and robust means to extract the Vocal Tract Filter (VTF) and the modelling of energy variations. The system is evaluated in the context of two voice quality transformations on natural human speech. The voice quality of a speech phrase is altered by means of re-synthesizing the deterministic component with different pulse shapes of the glottal excitation source. A Gaussian Mixture Model (GMM) is used in one test to predict energies for the resynthesis of the deterministic and the stochastic component. The subjective listening tests suggests that the speech processing system is able to successfully synthesize and arise to a listener the perceptual sensation of different voice quality characteristics. Additionally, improvements of the speech synthesis quality compared to a baseline method are demonstrated.
منابع مشابه
An HMM-based speech synthesiser using glottal post-filtering
Control over voice quality, e.g. breathy and tense voice, is important for speech synthesis applications. For example, transformations can be used to modify aspects of the voice related to speaker’s identity and to improve expressiveness. However, it is hard to modify voice characteristics of the synthetic speech, without degrading speech quality. State-of-the-art statistical speech synthesiser...
متن کاملOn glottal source shape parameter transformation using a novel deterministic and stochastic speech analysis and synthesis system
In this paper we present a flexible deterministic plus stochastic model (DSM) approach for parametric speech analysis and synthesis with high quality. The novelty of the proposed speech processing system lies in its extended means to estimate the unvoiced stochastic component and to robustly handle the transformation of the glottal excitation source. It is therefore well suited as speech system...
متن کاملTransforming voice quality
Voice transformation is the process of transforming the characteristics of speech uttered by a source speaker, such that a listener would believe the speech was uttered by a target speaker. In this paper we address the problem of transforming voice quality. We do not attempt to transform prosody. Our system has two main parts corresponding to the two components of the source-filter model of spe...
متن کاملGlottal source and vocal-tract separation Estimation of glottal parameters, voice transformation and synthesis using a glottal model
This study addresses the problem of inverting a voice production model to retrieve, for a given recording, a representation of the sound source which is generated at the glottis level, the glottal source, and a representation of the resonances and anti-resonances of the vocal-tract. This separation gives the possibility to manipulate independently the elements composing the voice. There are man...
متن کاملAutomatic voice-source parameterization of natural speech
We present here our work in automatic parameterization of natural speech by means of a pitch synchronous source-filter decomposition algorithm. The derivative glottal source is modelled using the Liljencrants-Fant (LF) model. The model parameters are obtained simultaneously with the coefficients of an all-pole filter representing the vocal tract response by means of a quadratic programming algo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015